intelligent virtual agent
9th Workshop on Sign Language Translation and Avatar Technologies (SLTAT 2025)
Nunnari, Fabrizio, Jiménez, Cristina Luna, Wolfe, Rosalee, McDonald, John C., Filhol, Michael, Efthimiou, Eleni, Fotinea, Evita, Hanke, Thomas
The Sign Language Translation and Avatar Technology (SLTAT) workshops continue a series of gatherings to share recent advances in improving deaf/hearing communication through non-invasive means. This 2025 edition, the 9th since the workshop first appeared in 2011, is hosted by the International Conference on Intelligent Virtual Agents (IVA), creating an opportunity for cross-pollination between two research communities that use digital humans either as virtual interpreters or as interactive conversational agents. As presented in this summary paper, SLTAT sees contributions beyond avatar technologies, with a considerable number of submissions on sign language recognition, and further work on data collection, data analysis, tools, ethics, usability, and affective computing.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > Canada (0.05)
- (11 more...)
Estuary: A Framework For Building Multimodal Low-Latency Real-Time Socially Interactive Agents
Lin, Spencer, Rizk, Basem, Jun, Miru, Artze, Andy, Sullivan, Caitlin, Mozgai, Sharon, Fisher, Scott
The rise in capability and ubiquity of generative artificial intelligence (AI) technologies has enabled their application to the field of Socially Interactive Agents (SIAs). Despite rising interest in modern AI-powered components for real-time SIA research, substantial friction remains due to the absence of a standardized, universal SIA framework. To address this absence, we developed Estuary: a multimodal (text, audio, and soon video) framework that facilitates the development of low-latency, real-time SIAs. Estuary seeks to reduce repeated work between studies and to provide a flexible platform that can run entirely off-cloud to maximize configurability, controllability, reproducibility of studies, and speed of agent response times. We achieve this by constructing a robust multimodal framework that incorporates current and future components seamlessly into a modular and interoperable architecture.
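The abstract above describes a modular, interoperable pipeline of swappable components. The following is a minimal sketch of that general idea, not the actual Estuary API; all class and function names (`Message`, `Pipeline`, `fake_asr`, `fake_dialogue`) are invented for illustration, with toy stand-ins where a real system would plug in ASR, dialogue, and TTS modules:

```python
from dataclasses import dataclass
from typing import Callable, List

# A message passed between pipeline stages; a real framework would
# also carry audio/video payloads and timing metadata.
@dataclass
class Message:
    modality: str   # e.g. "text" or "audio"
    payload: str

# Each stage is just a callable Message -> Message, so components
# (ASR, dialogue model, TTS, ...) can be swapped independently.
class Pipeline:
    def __init__(self) -> None:
        self.stages: List[Callable[[Message], Message]] = []

    def add_stage(self, stage: Callable[[Message], Message]) -> "Pipeline":
        self.stages.append(stage)
        return self

    def run(self, msg: Message) -> Message:
        for stage in self.stages:
            msg = stage(msg)
        return msg

# Toy stand-ins for real components, all running locally ("off-cloud").
def fake_asr(msg: Message) -> Message:
    return Message("text", msg.payload.lower())

def fake_dialogue(msg: Message) -> Message:
    return Message("text", f"echo: {msg.payload}")

agent = Pipeline().add_stage(fake_asr).add_stage(fake_dialogue)
reply = agent.run(Message("audio", "HELLO"))
print(reply.payload)  # -> echo: hello
```

Because stages share only the `Message` interface, swapping a cloud component for a local one is a one-line change, which is the configurability/reproducibility property the abstract emphasizes.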
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)
The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents
Chang, Che-Jui, Sohn, Samuel S., Zhang, Sen, Jayashankar, Rajath, Usman, Muhammad, Kapadia, Mubbasir
Previous studies regarding the perception of emotions for embodied virtual agents have shown the effectiveness of using virtual characters in conveying emotions through interactions with humans. However, creating an autonomous embodied conversational agent with expressive behaviors presents two major challenges. The first challenge is the difficulty of synthesizing conversational behaviors for each modality that are as expressive as real human behaviors. The second challenge is that the affects are modeled independently, which makes it difficult to generate multimodal responses with consistent emotions across all modalities. In this work, we propose a conceptual framework, ACTOR (Affect-Consistent mulTimodal behaviOR generation), that aims to increase the perception of affects by generating multimodal behaviors conditioned on a consistent driving affect. We have conducted a user study with 199 participants to assess how the average person judges the affects perceived from multimodal behaviors that are consistent and inconsistent with respect to a driving affect. The results show that among all model conditions, our affect-consistent framework receives the highest Likert scores for the perception of driving affects. Our statistical analysis suggests that making a modality affect-inconsistent significantly decreases the perception of driving affects. We also observe that multimodal behaviors conditioned on consistent affects are more expressive compared to behaviors with inconsistent affects. Therefore, we conclude that multimodal emotion conditioning and affect consistency are vital to enhancing the perception of affects for embodied conversational agents.
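The core idea of the framework above is that every modality is conditioned on the same driving affect rather than each modality choosing its own. A minimal sketch of that conditioning scheme, with all names (`make_generator`, `actor_style_generate`, `is_affect_consistent`) invented for illustration and toy generators standing in for real face/gesture/prosody synthesis:

```python
def make_generator(modality):
    # Toy per-modality generator: returns a behavior tagged with the
    # affect it was conditioned on. A real system would synthesize a
    # facial expression, gesture, or speech prosody here.
    def generate(affect):
        return {"modality": modality, "affect": affect,
                "behavior": f"{affect}-{modality}"}
    return generate

def actor_style_generate(driving_affect, modalities=("face", "gesture", "voice")):
    """Condition every modality on the SAME driving affect,
    the core idea behind affect-consistent generation."""
    return [make_generator(m)(driving_affect) for m in modalities]

def is_affect_consistent(behaviors):
    # Consistent iff all modalities were driven by one affect.
    return len({b["affect"] for b in behaviors}) == 1

consistent = actor_style_generate("happy")
print(is_affect_consistent(consistent))  # -> True

# Making one modality inconsistent, as in the paper's user study:
inconsistent = list(consistent)
inconsistent[1] = make_generator("gesture")("sad")
print(is_affect_consistent(inconsistent))  # -> False
```

The study's manipulation corresponds to the last three lines: flipping a single modality's conditioning affect, which participants rated as significantly weakening the perceived driving affect.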
- Asia > Middle East > Saudi Arabia > Eastern Province > Dhahran (0.14)
- Oceania > Australia > New South Wales > Sydney (0.05)
- North America > United States > New Jersey (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Education (0.93)
- Government > Regional Government (0.67)
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Nyatsanga, Simbarashe, Kucherenko, Taras, Ahuja, Chaitanya, Henter, Gustav Eje, Neff, Michael
Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology in film, games, virtual social spaces, and for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models, that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text, and non-linguistic input. We also chronicle the evolution of the related training data sets in terms of size, diversity, motion quality, and collection method. Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development.
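Before delving into deep learning, the review above briefly covers rule-based gesture synthesis. As a concrete illustration of that pre-deep-learning baseline, here is a minimal keyword-to-gesture mapper; the lexicon and rules are invented for this sketch, not taken from any system in the review:

```python
# Minimal rule-based co-speech gesture synthesis: keywords in the
# transcript trigger gesture lexemes. Real rule-based systems used
# richer linguistic features (part of speech, discourse structure).
GESTURE_RULES = {
    "hello": "wave",
    "big": "wide_arms",
    "me": "point_self",
    "you": "point_listener",
}

def rule_based_gestures(transcript: str):
    gestures = []
    for i, word in enumerate(transcript.lower().split()):
        lexeme = GESTURE_RULES.get(word.strip(".,!?"))
        if lexeme:
            gestures.append((i, lexeme))  # (word index, gesture lexeme)
    return gestures

print(rule_based_gestures("Hello, I promise you a big surprise!"))
# -> [(0, 'wave'), (3, 'point_listener'), (5, 'wide_arms')]
```

The brittleness is visible immediately: anything outside the lexicon produces no motion, and timing is only word-aligned. This is exactly the coverage and naturalness gap that the data-driven models surveyed in the review aim to close.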
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (10 more...)
- Research Report (1.00)
- Overview (1.00)
- Education (1.00)
- Media (0.67)
- Leisure & Entertainment > Games > Computer Games (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)
2023's Top 4 AI Use Cases in Healthcare Communications
I recently had a scare during the holidays when my octogenarian father, visiting from out of town, fell in the kitchen during our Christmas Eve gathering. My heart skipped a beat, and ten things ran through my mind about what to do next; I found myself wishing we had a doctor in the family. Going to the ER did not seem like the answer, but we had concerns for him, and I needed some peace of mind. Could I give him Tylenol, or would that interfere with his current medicines?
How is AI Revolutionizing the Telecommunications Industry?
Robotic Process Automation (RPA): Robotic Process Automation is a technology that configures computer software to capture data and manipulate applications the way humans do. With RPA, telecommunication providers can automate back-end activities such as data entry, reconciliation, or validation; streamline customer support; and perform cross-selling and up-selling using AI-powered assisted calls. RPA applications allow CSPs to reduce costs, enhance accuracy, improve efficiency, and deliver a better customer experience.

Intelligent Virtual Agents: Intelligent Virtual Agents based on AI technologies are gaining traction in the telecommunications sector, improving customer experience and satisfaction. Telecom providers have turned to virtual assistants to optimize the processing of the huge number of support requests for troubleshooting, billing inquiries, maintenance, device settings, and more. AI-powered assistants handle all service-type questions and process transactions efficiently and at high speed.
- Telecommunications > Networks (0.72)
- Information Technology > Networks (0.72)
The making of an intelligent virtual agent (IVA)
For years, businesses have sought to provide customers with more self-service options and increase automation rates in their contact centers using speech-enabled interactive voice response systems (IVRs). They have also invested heavily in developing web chatbots. However, these systems were complicated to develop and required organizations to purchase, host, and manage a vast array of software, hardware, and equipment. Applications were also created in silos, requiring multiple development projects while making it difficult for applications to share data and context. A number of disruptive innovations have made it easier and more affordable to deploy AI-and-speech-enabled self-service.
OnviSource Releases its New Multi-Engine and Proprietary Artificial Intelligence
OnviSource announced today it has started the deployment of its AI-driven solutions powered by its new proprietary Artificial Intelligence software, called iMachine. The Company's solutions are able to utilize the most optimized AI engine pertinent to their specific application. For example, the Company's Intelligent Virtual Agent, or smart bot, called Liaa, primarily utilizes iMachine's NLP/NLU engine, while the Intellecta multichannel analytics and Automata RPA products may use iMachine's ML and DL engines for a variety of their AI-driven features. The use of iMachine by the Company's solutions in analytics, RPA, and IVA significantly enhances their capabilities in effectively addressing today's enterprise and contact center challenges in workforce optimization, customer experience management, and business process automation, as well as in automating the management of enterprise content. The content of calls, audio files, email, chat, text, and structured or unstructured documents can be analyzed by iMachine to discover intent, purpose, compliance, categories, sentiment, root causes, and complex information otherwise undetected by analytics that do not use AI engines.
- North America > United States > Texas > Collin County > Plano (0.05)
- North America > United States > Oklahoma (0.05)
- Information Technology > Artificial Intelligence > Robots (0.91)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.78)
- Information Technology > Data Science > Data Mining > Big Data (0.41)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)
Inference Solutions Enables Conversational AI for Cisco's On-Premise Platforms -- Inference Solutions
Enterprises using Cisco's UC and CC offerings can upgrade their IVRs to Intelligent Virtual Agents without "ripping and replacing" existing software and equipment

SAN FRANCISCO, CA – (September 17, 2019) – Inference Solutions, a global provider of Intelligent Virtual Agents (IVAs) for sales and service organizations, today launched new solutions that extend the self-service capabilities of Cisco Unified Communications Manager (UCM), Unified Contact Center Enterprise and Unified Contact Center Express (UCCE/X). Enterprises using these on-premise solutions can now easily upgrade their existing IVRs with cloud-based virtual agents powered by conversational AI. With Inference, IT organizations do not need to "rip and replace" their existing Cisco platforms to move their IVRs to the cloud. They can continue using UCM and UCCE/X while deploying cloud-based virtual agents managed by Inference, enjoying the benefits of an advanced self-service solution without installing the software, hardware or equipment required to run it. When organizations are ready to move their on-premise Cisco users to a cloud-based solution, their virtual agents will easily make that transition.